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ABSTRACT 

The galaxy power spectrum encodes a wealth of information about cosmology and 
the matter fluctuations. Its unbiased and optimal estimation is therefore of great 
importance. In this paper we generalise the framework of Feldman et al. (1994) to 
take into account the fact that galaxies are not simply a Poisson sampling of the 
underlying dark matter distribution. Besides finite survey-volume effects and flux- 
limits, our optimal estimation scheme incorporates several of the key tenets of galaxy 
formation: galaxies form and reside exclusively in dark matter haloes; a given dark 
matter halo may host several galaxies of various luminosities; galaxies inherit part 
of their large-scale bias from their host halo. Under these broad assumptions, we 
prove that the optimal weights do not explicitly depend on galaxy luminosity, other 
than through defining the maximum survey volume and effective galaxy density at a 
given position. Instead, they depend on the bias associated with the host halo; the first 
and second factorial moments of the halo occupation distribution; a selection function, 
which gives the fraction of galaxies that can be observed in a halo of mass M at position 
r in the survey; and an effective number density of galaxies. If one wishes to reconstruct 
the matter power spectrum, then, provided the model is correct, this scheme provides 
the only unbiased estimator. The practical challenges with implementing this approach 
are also discussed. 

Key words: Cosmology: large-scale structure of Universe. 


1 INTRODUCTION 

The power spectrum of matter fluctuations, or equivalently the two-point correlation function, is a fundamental tool for 
constraining the cosmological parameters. It contains detailed information about the large-scale geometrical structure of 
space-time, the constituents of energy-density and their evolution with redshift, and also provides us with information about 
the primordial scalar fluctuation spectrum. However, we do not directly observe the matter density field, instead we observe 
galaxy angular positions and measure radial velocities, or redshifts, from spectra. Given a galaxy redshift survey, two things 
are crucial: how to obtain an unbiased estimate of the information in the matter fluctuations; and obtaining an estimate that 
has the highest signal-to-noise possible, i.e. an optimal measurement. 

The first point may be rephrased as the need to understand the relation between galaxy and matter fluctuations - more 
commonly referred to as galaxy bias. The second point, that of optimality, in fact also relies on our understanding of bias, 
since only through knowing how the galaxies are embedded in the mass distribution can one devise efficient survey strategies; 
for example, if all galaxies formed in pairs then one would only require information about one galaxy from each pair to obtain 
all of the useful cosmological information. 

The development of galaxy correlation functions as a tool for constraining the cosmological model was first realized by 
Peebles and collaborators in a series of pioneering papers in the 1970s (Peebles 1973; Hauser & Peebles 1973; Peebles & Hauser 
1974; Peebles 1974; Peebles & Groth 1975; Peebles 1975; Seldner & Peebles 1977; Groth & Peebles 1977; Fry & Peebles 1978; 
Seldner & Peebles 1978, 1979; Fry & Peebles 1980). Subsequent studies built on this, and estimators were developed that 
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took better account of fluctuations in the mean number-density of galaxies (Davis & Peebles 1983; Landy & Szalay 1993; 
Hamilton 1993; Bernstein 1994). These new estimators were only optimal in the case that the clustering was very weak and 
when galaxies represented a Poisson sampling of the underlying matter fluctuations. 

The development of techniques for the direct estimation of the galaxy power spectrum began in earnest in the early 1990s 
(Baumgart & Fry 1991; Peacock & Nicholson 1991; Fisher et al. 1993). This culminated in the seminal work of (Feldman et al. 
1994, hereafter FKP). In their seminal approach, galaxies were assumed to be a Poisson sampling of the mass density field. 
They showed that provided one subtracted an appropriate shot-noise term, and deconvolved for the survey window function, 
one could obtain an unbiased estimate of the matter power spectrum. Subsequent analysis focused on obtaining quadratic 
and decorrelated band estimates (Vogeley & Szalay 1996; Hamilton 1997a,b, 2000; Hamilton & Tegmark 2000). 

In the last two decades our understanding of galaxy formation has made rapid progress and the current best models 
strongly suggest that galaxies are not related to the underlying dark matter in the simple way that was envisioned in FKP. 
(White & Rees 1978; White & Frenk 1991; Kauffmann et al. 1999; Benson et al. 2000; Springel et al. 2005). Furthermore, 
improved observational studies have subsequently discovered that galaxy clustering is in fact dependent on a number of 
properties of the galaxy distribution: e.g. luminosity (Park et al. 1994; Norberg et al. 2001, 2002; Zehavi et al. 2002a, 2005a; 
Swanson et al. 2008; Zehavi et al. 2011a), colour (Brown et al. 2000; Zehavi et al. 2002b, 2005b; Swanson et al. 2008; Zehavi 
et al. 2011b), morphology (Davis & Geller 1976; Guzzo et al. 1997; Norberg et al. 2002), stellar mass (Li et al. 2006) etc. 

Percival et al. (2004, hereafter PVP) attempted to correct the FKP framework to take into account the effects of luminosity 
dependent bias. To this end, PVP assumed that the probability of finding a galaxy of a given luminosity in a certain patch 
of space would be a Poisson variate, whose mean was proportional to the local density of dark matter multiplied by a 
luminosity-dependent bias factor. Their work demonstrated two important facts: firstly that an optimal weighting scheme 
depended sensitively on the assumptions about the bias and secondly, if their assumptions about the bias were correct, the 
FKP method was a biased estimator of the matter power spectrum. 

In this paper we argue that the approach of PVP, whilst qualitatively reasonable, is in fact still at odds with our 
current understanding of galaxy formation and therefore unlikely to be the true optimal estimator. The key ideas from galaxy 
formation and evolution that we would like to build into our estimator are: galaxies only form in dark matter haloes (White 
& Rees 1978); haloes can host a number of galaxies of various luminosities; the large-scale bias associated with a given 
galaxy, is largely inherited from the bias of the host dark matter halo. In a recent paper (Smith & Marian 2014, hereafter 
SM14), we generalised the FKP formalism to account for the clustering of galaxy clusters - which turned out to have a 
similar mathematical structure to the PVP scheme. We now undertake to generalise the FKP formalism to take into account 
these ideas from galaxy formation. As we will show, these effects will lead us to a new optimal estimator and method for 
reconstructing the matter power spectrum. 

Before moving on, it is worth noting that current state-of-the-art galaxy redshift surveys, such as the Baryon Oscillation 
Spectroscopic Survey (Anderson et al. 2012, 2014b,a, hereafter BOSS), Galaxy And Mass Assembly (Blake et al. 2013, hereafter 
GAMA), and WiggleZ (Blake et al. 2011), have all used the FKP power spectrum estimation procedure. Future surveys, such 
as DESI (Levi et al. 2013), Euclid (Laureijs et al. 2011) and SKA (Blake et al. 2004), will have significantly larger volumes 
and so unbiased and optimised data anlaysis will be crucial if we are to obtain the tightest constraints on the cosmological 
parameters. 

The paper is broken down as follows: In §2 we describe generic properties of a galaxy redshift survey and present a new 
theoretical quantity, the halo-galaxy double-delta expansion. We explore its statistical properties. In §3 we show how one 
may obtain unbiased estimates of the matter correlation function and power spectrum. In §4 we derive the covariance matrix 
of the fluctuations in the galaxy power spectrum. In §5 we derive the optimal weights. In §6 we present a new expression 
for the Fisher information matrix for optimally-weighted galaxy power spectra. In §7 we enumerate the steps for a practical 
implementation of this approach. Finally, in §8 we summarize our findings and draw our conclusions. 


2 SURVEY SPECIFICATIONS AND THE Jg-FIELD 
2.1 Preliminaries: a generic galaxy redshift survey 

Let us begin by defining our fiducial galaxy survey: suppose that we have observed Ag°* galaxies and to the ith galaxy 
we assign a luminosity Li, redshift Zi and angular position on the sky fti = n(0i, <();). If we specify the background FLRW 
spacetime, then we may convert the redshift into a comoving radial geodesic distance Xi = xi^i)- galaxy’s comoving position 
vector may now be expressed as Vi = r(xi, Hi). 

The survey mask function depends on both the position and luminosity of galaxies, given an adopted flux limit. In this 
work we shall take the angular and radial parts of the survey mask function to be separable, though this assumption does not 
change our results: 

e(r|L) = e(H)e(x|L). (i) 
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Note that if the flux-limit is not uniform across the survey then the radial function 0(x, L) would still be a function of the 
angular position vector f2, and the survey mask cannot be separated as in the equation above. The angular part of the mask 
may be written as: 


e(n) = 


1 ; 

0 ; [otherwise] ’ 


( 2 ) 


where is the set of angular positions that lie inside the survey area. The radial mask function may be written: 


e(x|T) = 


1 i [X ^ Xmax(T)] 

0 ; [otherwise] 


(3) 


where Xmax(T) is the maximum distance out to which a galaxy of luminosity L could have been detected. 

The survey volume for galaxies with luminosity L is simply the integral of the mask function over all space: 


V^{L) = I Q{T\L)dV, (4) 

where dV is the comoving volume element at position vector r (for a flat universe dV = cPr = yfdO.dx) ■ In what follows it 
will be also useful to note that the relation Xmax(T) may be inverted to obtain the minimum galaxy luminosity that could 
have been detected at radial position xiO in Ih® survey. We shall write this as: 

[Linin(r)//i“^L0] = [dL(r)/h“^Mpc] ^ , (5) 


where is the apparent magnitude limit of the survey, Mq is the absolute magnitude of the sun, h is the dimensionless 
Hubble parameter and dr is the luminosity distance (for a flat universe di^{z) = (1 -|- z)x{z)). Thus for any general function 
S(x, T), we have the useful integral relations: 



dxQ{x\L)B{x,L) 



dxBix,L) 



dLB{x,L) . 


( 6 ) 


2.2 The halo-galaxy double-delta expansion 

Our understanding of galaxy formation tells us that galaxies form exclusively in dark matter haloes, and that each dark 
matter halo may host several galaxies of various luminosities. It therefore follows that the large-scale bias associated with any 
given galaxy is directly proportional to the bias of the host halo. We shall mathematically encode these ideas in our density 
field as follows: our A^g°* galaxies are distributed inside Nh dark matter haloes. Thus the ith dark matter halo of mass Mi 
and centre of mass position x^ will host a number of galaxies that depends on its mass, Ng{Mi). The j'th galaxy will have a 
position vector relative to the centre of the halo and a luminosity Lj. 

In order to study the statistical properties of the galaxy field in this scenario we need to simultaneously account for both 
the spatial distribution of the haloes, as well as the galaxies inside them. We therefore introduce a new function, dubbed the 
‘galaxy-halo double-delta expansion’. This is a Dirac delta function expansion over the halo positions and masses, as well as 
over the positions and luminosities of the galaxies inside the haloes. It is written: 

ng(r, L, X, M) = ^(5°(x — Xi)(5°(M — Mi) ^ 6°(r — Vj — ■>ii)S°{L — Lj) , (7) 

i=i j^i 

where Nh is the total number of host dark matter haloes in the survey volume and N^{Mi) is the total number of galaxies in 
the ith halo. In the above function, the order of variables is important: r refers to the spatial vector in the galaxy field, L the 
luminosity, x the spatial vector in the halo field, and M the halo mass. Note that the units of the above function are inverse 
squared volume, inverse mass, and inverse luminosity 

Next, in analogy with SM14, we define a field J^g, which is related to the over-density of galaxies. Our survey will be 
finite and will contain masked regions and an apparent magnitude limit mum. Hence, the overdensity field of galaxies with 
magnitudes above some threshold luminosity may be written using Eq. (7) in the following way: 

Jg(r) = ^ dL J d^x dMe{v\L) [ng(r, L, x, M) - ans(r, L, x, M)] , (8) 

where w{r, L,x, M) is a weighting function that we will wish to determine in an optimal way, and H is a normalisation 
parameter that will be chosen later. 

The function na(r, L, x, M) is the random galaxy-halo double-delta expansion. This immediately leads to an important 


^ We note that this equation is the more rigorous starting point for all Halo Model calculations of the galaxy field. However, so far as 
we are aware it has not been written down before. This in part owes to the fact that the galaxy-clustering expressions could be deduced 
by analogy with the mass clustering. However, for the case of the optimal weights in a realistic survey that approach is not feasible. 
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question: what constitutes a random catalogue? Conventionally, in the FKP approach one would distribute the mock galaxies 
randomly within the survey volume - preserving the number counts as a function of redshift. However, since we know (or have 
assumed in this model) that galaxies form only inside dark matter haloes, we do not want to remove this property. Instead it 
is the dark matter haloes which should be randomly distributed, and not the galaxies. Therefore, the function na(r,L,x, M) 
represents the distribution of galaxies in a mock sample whose dark matter haloes possess no intrinsie spatial correlations, and 
have a number density that is Xjot of the true galaxy-halo double-delta field. Note that for this random distribution, while 
the halo centres are not correlated, the galaxies still follow the density distribution inside each dark matter halo. In addition, 
the haloes possess a mass spectrum and the galaxy luminosities are conditioned on the halo mass. 

Both quantities defined by Eqs. (7) and (8) are of central importance and will be extensively used in this paper. It is 
therefore worthwhile for us to take some time to understand their meaning and how one should employ them to infer the 
statistical properties of the galaxy density field. This we do in the following section. 


2.3 Calculation of the expectation of the galaxy density field 

As a demonstration of how one can use the halo-galaxy double-delta expansion and take statistical averages we calculate the 
expectation of T^. We use Eq. (8) to break (J^g(r)) into two parts: 


(J-g(r))= / dLe(r|L)KA'g(r,L))-a(A;(r,L))] , 

where we introduced the weighted mean number density of galaxies per unit luminosity, at the spatial position r(x, H): 


(9) 


(A4(r,i)) = 


’ ^ w(r, L, X, M) , , , „ 

dM ~^ — ^ {ng{r, L, x, M)) 


/V5°(x-Xi)5°(M-Mi) V 5°{r - Vj - {L - Lj) 

J Jo VA 




( 10 ) 


with a similar expression for (A/'s(r, L)). The function in Eq. (10) is related to the galaxy luminosity function. 

To proceed further we now need to understand what taking the ‘expectation value’ actually means. Following Smith 
(2012), this operation can be broken down into three steps. First, the fluctuations in the underlying dark matter field are 
sampled - we shall denote this averaging through a sub-script s. Second, given the dark matter field, the haloes may be 
obtained as a sampling of the density field - we shall denote this operation through sub-script h. Third, given a set of dark 
matter haloes, and sufficient knowledge of the properties of the halo, galaxies may then be sampled into each halo - we shall 
denote this operation through sub-script g. Hence, Eq. (10) can be rewritten, 

/NgiMi) 


(Afg(r,L)) = I d\ (^^d°(x-x0d°(M-M0^ 5°(r - r, - x05°(T - T,)^ ^ 


( 11 ) 


Let us now compute the average over the galaxy sampling for the ith dark matter halo: 

/Ng(Mi) \ oo . 

t=i / g 


\ OO p 

)) = ^ P(iVg|A(M0) / n {d^rkdLk} p{ri ,..., rjVg, Li,.. ■, Lvg|Mi,Xi) 

/ „ iVg=0 •' k=l 


X |^(5°(r - n - Xi)5°(L - Li) H--I-5°(r - rvg - Xi)5°(L - LjVg)] , (12) 

where in the above we have introduced the following quantities: P(Ag|A(Mi)) is the discrete probability that there are 
Ag galaxies in the ith dark matter halo and this we assume depends on some function of the dark matter halo mass 
p(ri,..., rjVg, Li,, LjVg lAfi, Xi) is the joint probability density function for finding the Ag galaxies being located at positions 
{ri,..., rjVg} relative to the halo centre Xi, and with luminosities {Li,..., Lvg}, conditioned on Mi and Xi. We have assumed 
that the properties and distribution of the galaxies in the ith halo are independent of all other external haloes. If the probability 
for finding a galaxy at a given position inside a halo is determined by the density prohle of the matter in the halo, and if the 
probability that the galaxy has a luminosity L depends only on the halo mass, then this joint probability can be written in 
the following manner: 

JVg iVg 

p(ri,..., rvg, Li,... ,LjVg|Mi,Xi) = {p(rfe|M„ x0p(Lfe|M0} = H {t^(rfc|M„ x0<E>(Lfe|M0} , (13) 

fc=l k = l 

where in the above equation we have used the density profile of galaxies in the halo, normalised by the total number of galaxies 
in that halo, U, to define 

p(r|M,x) = l/(r|M,x) = pg(r|M,x)/Ag(M) . (14) 

We have also used 

p{Lk\MS = ^{Lk\Mi) , (15) 
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as the probability density that a galaxy hosted by a halo of mass M, has a luminosity L In writing Eq. (13) we have assumed 
that, for a given galaxy, its spatial location inside the dark matter halo is independent of its luminosity. As will be shown 
later, this assumption will not be crucial for the derivation of the optimal weights. 

On integrating over the Dirac delta functions in Eq. (12) we find 

/Ng(Mi) \ oo 

5°{v-r,-y,,)S°{L-L,)) = ^ P(Arg| A(M,))fV9l7(r - x,|M0<l-(L|Mi) 




JVg=0 

— atWi 


= , (16) 

where we have suppressed the dependence of U on the halo centre. The second equality follows from the definition of the first 
factorial moment of the galaxy distribution: 


P{N,\X{M0)N, . 


(17) 


Returning to our main calculation, on substituting the last two equations into Eq. (11), we now obtain 

' JVh 


(A'g(r, L)) = j ^^ w(r,^,M) _ x,)5D(m - Mi)Nf^\Mi)Uir - Xi\Mi)${L\Mi)j 

= J ^^ ic(r’^,M) J ^3^^ ^ d^xN^dMi... dMjVhP(xi,... xjvh, Mi,..., MjvJ 


X ^ 5°(x - xO<5°(M - Mi)Nf^\M,)U{r - x,|Mi)<E>(T|Mi) . 


(18) 


In the above, we followed SM14 to introduce p{xi,... Mi,..., MnO as the joint probability density for the A^h dark 
matter halo centres being located at positions {xi,..., xat^}, and with masses {Mi,..., MnO- integrating over the Dirac 
delta functions, the mean number density of galaxies becomes, 


(Afg(r,L)) = J jVhp(x, M)N^f-\M)U(v - x|M)$(L|M) . 


(19) 


The joint distribution function for obtaining a halo of mass M at position x can be written as the product of two independent 
one-point probability density functions (Smith & Watts 2005): 

h(M) 


p(x, M) = p(M)p(x) = X 


Ni, ’ 

where n{M) is the mean mass function of dark matter haloes, which tells us the number density of haloes of mass M, per 
unit mass, and hh = A^h/Vji is the mean number density of haloes. On substituting this expression into Eq. (19) we find that 
the mean density of galaxies, per unit luminosity, at spatial location r may be written: 


(A'g(r,L)) = ^(()„(r,L) 
where we have defined 


( 21 ) 


( 22 ) 


()>„(r,L) = ^ dMn[M)N^^\M)^{L\M) j d^xw{r, L,x, M)U{r - x\M) . 

If we were to set the weight function to unity, the above expression would be the galaxy luminosity function (Yang et al. 
2003): 


poo 

(l)iL)= dMn{M)Nf^'>{M)^{L\M) . 

Jo 


(23) 


Turning to the second expectation value in Eq. (9), we note that the only difference between {J\fg{r, L)) and (A/'s(r, L)) is 
the artificially increased space-density of clusters and the absence of any intrinsic clustering. Hence, we also have, 

a (A4(r, L)) = ^((i„(r, L) . (24) 

y/A 

Returning to Eq. (9) and inserting Eqs. (21) and (24) we arrive at the result: 

(Pg(r)) = 0 . (25) 

Hence, the Pg-field, like the over-density field of matter, is truly a mean-zero field. 

Note that we have neglected to take into account the statistical properties of obtaining the Ah clusters in the survey 


^ Note that this is closely related to the conditional luminosity function introduced by Yang et al. (c.f. 2003), which in our notation 

would be -I-Yangetal(7|M) = N^f\M)^{L\M). 
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volume. In what follows we shall assume that the survey volumes are sufficiently large that this may be essentially treated as 
a deterministic quantity. However, it can be taken into account (e.g. see Sheth & Lemson 1999; Smith & Watts 2005; Smith 
2009). 


3 CLUSTERING ESTIMATORS 

We now move on to the more interesting problem of using the halo-galaxy double-delta expansion to compute the clustering 
properties of the galaxy distribution. We begin first with the correlation function, and then through Fourier transforms look 
at the power spectrum. This task will be somewhat laborious, however it will enable us to develop and establish a number of 
important concepts and results. 


3.1 The two-point correlation function of galaxies 

The two-point correlation function of the field can be computed using our double-delta expansion through: 

{J-g(ri)J-g(r2)) = i J dLidL2d^xid:^X2dMidM20{ri\Li)Q{r2\L2)w{ri, Li, xi, Mi)w(r2, 1 / 2 , X2, M 2 ) 

X 1 ^ (ng(ri,Li,xi,Mi)ng(r2,I/2,X2,M2)) - a (ng(ri, Li, xi, Mi)ns(r2, L2, X2, M2)) 

- a (ns(ri, Li,xi, Mi)ng(r 2 ,1/2,X2, M 2 )) -|- (ns(ri, Li,xi, Mi)ns(r 2 ,1/2,X2, M 2 ))j . (26) 

The expectation terms in the square bracket on the right-hand side of this equation can be evaluated in a similar manner as 
was done for the case of the mean density. In Appendix A, we provide a detailed derivation of the terms (nging 2 ), (nsing 2 ), 
(ngins 2 ) and {nsina 2 }- On substituting Eqns (A9), (AlO), (All) and (A12) into Eq. (26) and on integrating over the delta 
functions, we find that the correlation function may be written as the sum of three terms: 

(J-g(ri)J-g(r2)) = ]^|yd"xMMih(M,)&(Mi)iV^'HMi)W('()(ri,Xi,Mi)|e(|xi -X 2 I) 

-f(l-fa) J d^a;dMh(M)Af>(M)W(()(ri,x,M)>V(^)(r2,x,M) 

-f(l-fa) J d^a:dMh(M)Aj(i>(M)W(^)(ri,x,M)5°(ri-r2) , (27) 

where b{M) is the large-scale linear bias of dark matter haloes, ^(x) is the dark matter correlation function, and Ng^\M) 
is the second factorial moment of the galaxy numbers (for more details on these quantities see Appendix A). In the above 
expression we have also defined the quantity: 

(r, X, M) = U‘(r- x|M) Wfp (r, x, M) , 

with 

Wp)(r,x,M) = ^ J dL‘i>{L\M)Qir\L)w\v,L,^,M) . 

Based on the above analysis, we see that an obvious estimator for the J-g correlation function is, 

irg{r) = J d^r'Tg{v')Tg{r+ r') ; (r / 0) . (30) 

The expectation of the estimator is: 


(28) 

(29) 


= 


n [d\. dM,h(M06(M,)iV(i)(M0} ^(1 Xl - X 2 I) J d^r'Vy(^) (r', Xl, Mi)W’(^)(r-f r',X2, M 2 ) 


‘I- 


H-(lH-a) / axdMn 


(fr Vy(i) (r , X, M) W(i) (r + r , x, M) ; (r / 0). 


(31) 


In general is a biased estimator for the matter correlation function Thus, in order to make a robust comparison between 
theory and observations, one must either compute the theory predictions as in Eq. (27) or generate Monte-Carlo mock samples 
and use the same estimator to compare theory and observation. 


3.2 An estimator for the matter correlation function in the large-scale limit 

In the large-scale limit, the clustering of galaxies can be used to obtain an unbiased estimate of the matter correlation 
function. To see this note that, since the dark matter haloes in simulations are cuspy, on large scales the mass- or galaxy- 
number-normalised density profiles of galaxies behave approximately as Dirac delta functions. We shall therefore take 

U^^{r\M) ^ S°{r) , (32) 
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and on implementing this in Eq. (31), we find after integrating over the Dirac delta functions: 

(cVg(r)} « ^(r) / l[[dMin{M,)biMi)Nf^\Mi)} f dV'W(i)(r', r', Mi) (r + r', r + r', M 2 ) ; (r / 0) (33) 

In this limit there are a number of interesting things that happen: firstly, the weight function has now become independent 
of the halo positions, i.e. we no longer differentiate between galaxy and halo positions, taking them to be the same. Hence, 
we may now write: 

w(r , L, X, M) ^ (r, L, M) ; W(o (r, x, M) ^ (r, M) . (34) 

Secondly, the dark matter correlation function has separated out and so we may easily invert Eq. (33) to obtain an unbiased 
estimate for the dark matter clustering. The estimator is: 

CVjr) 


g(r) ~ ; So(r) = y d\'e;|^^\j(r')e(i\\)(r+ r') ; (r / 0) 

where we have defined the new set of weighted window functions: 

= j dMn{M)b^{M)N^^\M) (r, M)]" . 


(35) 


(36) 


The function Eo(r) represents the correlation function of the averaged survey window functions. Our correlation function 
estimator Eq. (35), is therefore a generalization of that of (Tandy & Szalay 1993). 

3.3 The galaxy power spectrum 

We now turn to the Eourier space dual of the correlation function - the galaxy power spectrum. To obtain this let us begin 
by defining our 3D Fourier transform convention for a function B and its inverse as: 

H(k) = I d\B{vy^- ^ B{v) = I ^H(k)e-*‘‘- . 

We distinguish real- and Fourier-space quantities that share the same symbol through use of the tilde notation. We also define 
the power spectrum Psik) of any infinite statistically homogeneous random field B(k) to be: 


.B(k)H(k')) = (27r)®5°(k + k')Ps(k) . 


Note, if the field B were statistically isotropic, the power spectrum would simply be a function of the scalar k. In addition 
the power spectrum and two-point correlation function of the field B form a Fourier pair: 

?r.(|x-x'|)=| ^ J yV P,(q)(2.)^5°(q + q')e---e--'-' . 

With these definitions in hand we may now transform Eq. (27), and on considering the case where k 2 = —k, we find that the 
expectation of the square of the amplitude of the Fourier modes of J-g is given by: 

(|.Pg(k)|") = I ^P(q)n|ydM,h(M 06 (M)ivW(Mo|w('()(k,-q,Mi)Wfi)(-k,q,M 2 ) 


+(l + a) / ^dMh(M)iVf)(M)W('()(k,-q,M)W('i)(-k,q,M) 


-b(l-ba) J dVd^a;dMh(M)N'^^^(M)W(^)(r,x,M) 


(37) 


where P(q) is the matter power spectrum and we have set the Fourier transform of the effective survey window function to 
be: 


VV( 0 (k, q,M) = J d^rd^xW(i){r,x,M)e'^'''e‘^'"^ . 


(38) 


As in the case of the correlation function of the field Pg, its power spectrum does not provide a direct estimate of the matter 
power spectrum. 


3.4 The power spectrum in the large-scale limit 

Let us now consider the power spectrum of Pg in the large-scale limit. As discussed in §3.2 we expect the density profiles to 

behave like Dirac delta functions in real space, in Fourier space the density profiles on large scales simply obey: U (k|M) - y 1. 

Under this condition Eq. (37) simplifies to: 
j3. 


(|.Fg(k)|^)« I ^P(q)|gWj(k-q)|' + P,hot , 


(39) 
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where the second term on the right-hand side is a fc-independent effective shot-noise term, 

= (1 + a) (0) + eg’o) (0)] • (40) 

In the limit that the snrvey volume is large, the window functions will be very narrowly peaked around k = 0. 

Provided the matter power spectrum is a smoothly-varying function of scale, the window functions (k) will behave in a 

way that is similar to the Dirac delta function. Hence, Eq. (39) becomes: 


l-^g(k)|- 


pm/ 


1^(1) 

(271 




(k - q) -I- Pal 


(41) 


Let us focus on the integral factor on the right-hand-side of the above expression. If we now perform the transformation 
of variables q —>■ k — q and use Parseval’s theorem, we find: 


1,^(1) 


(27r)3 


5i,\)(k-q) = 




(42) 


Upon back-substitution of Eq. (36) into the above expression, we obtain: 

J dV|gg\j(r)p = \ J j dMn{M)M^\M)h(M) j dL<i>{L\M)Q{r\L)w{r,L,M) 

Note that we have not yet specified the parameter A, which we now take to be: 

A = j d^r j dMn[M)NP [M)b{M) J dL^{L\M)e{r\L)w{r,L,M) . 

Note, we will now drop the super-script LS notation for w and VV’(i), since for the remainder of this study we shall be working 
only in the large-scale limit. Thus, our estimator for the matter power spectrum can be written simply: 

P(k) = |.Pg(k)|" - P,hot . (43) 

If our modelling of the galaxy distribution is correct, then the above estimator constitutes the only unbiased estimator of the 
matter power spectrum. Before proceeding further, note that in the above expression we have obtained the power spectrum per 
mode. In fact, we are more interested in its band-power estimate. Hence, our final estimator in the large-scale and large-volume 
limit is: 


P{ki) = ^f d^kP{k)=^f d3fc|.Pg(k)|"-P,hot 
D Jvi Jvi 


where in the above we have summed over all modes in a fc-space shell of thickness Afc and volume 




r pfci-l-Afc/2 1 / Ah\ 

/ d^k = 477 / k^dk = 47rfe?Afc 1 + — ( ^ ) 

Jvi Jki-Ak/2 12 \ ki J 


(44) 


(45) 


4 STATISTICAL FLUCTUATIONS IN THE GALAXY POWER SPECTRUM 


In order to obtain the optimal estimator we need to know how the signal-to-noise (hereafter S/Af) varies when we vary the 
shape of our weight function w. Thus, we need to understand the noise properties of our power spectrum estimator, i.e. 
compute its covariance matrix. The covariance matrix of two band-power estimates is given by: 

Co^[P{h),P{kP] = {P{h)P{kP)-(P{h)){P{kj)) = yJ^Pkl^ l^d\2C0Y[p{kr),P{k2)] , 


where the last factor on the right-hand side of the above expression is the covariance of the power in two separate Fourier 
modes. For the case of large survey volumes and in the large-scale limit the matter power spectrum is given by Eq. (44). 
Hence, 

Cov[p(ki),P(k2)] R7 Cov[|.Pg(kl)|M.Pg(k2)|^] =(|.Pg(ki)n.Fg(k2)|^)-(|.Pg(kl)|2)(|Pg(k2)|^) . (46) 

The approximation in the equation above follows from the discussion in Appendix Bl. 

In Appendix C, we derive a general expression for the covariance matrix of |.Fg(ki)p and |.Fg(k 2 )|^, with all n-point 
connected spectra and shot-noise terms included - this is obtained by combining Eqs. (Cl) and (C2) with Eq. (C13). Under 
the assumption of a Gaussian matter density field, our general expression simplifies to Eq. (CIS). Furthermore, we also show 
in Appendix C3 that in the large-scale limit Eq. (CIS) can be written as: 


Cov[|.Pg(kl)|^ |Pg(k2)|"] = I ^P(q)g(« )(ki + q)e(g\)(k2 - q) + (1 + o) [ 4 \>o)(kl + k2) + egg)(ki + k2)] 

+ I (ki + (-k2 - q) + (1 + a) pgg)(ki - k2) + (ki - k2)] 


(47) 
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"(n) 

In the limit where the survey volume is large, the functions are very narrowly peaked around A: = 0. Furthermore, if 

the power spectrum does not rapidly vary over the scale of the effective window function, then we may treat it as a constant 
in Eq. (47). Thus, 

Cov[|.Fg(kl)|^ \T,{k2)f] « + k2) + (1 + a) [Q|?Jo)(ki + k2) + Q|2|o)(ki + k^)] |' 


+ |p(ki)Q[i;iJ, ,)(ki - k2) + (1 + a) [Q[?|o)(ki - k2) + Qg|o)(ki - k2)] | 


where we have introduced the functions: 

irl 


(48) 

(49) 


and made use of the convolution theorem to write their Fourier transforms: 


Q 


("1,11-2) 

(h,h\mi,i"2) 




Note that we also used the trivial identity 

Returning to Eq. (46), we find that after substitution of Eq. (48) into Eq. (46), the bin-averaged estimates of the power 
spectrum can be written: 

Cov[P(A:0,P(A:,)] =2^ |p(kOQ|i;i|, .^(ki+k^) + (1 + a) [Q[?Jo)(ki + k^) + ^(ki + k 2 )] |'. (50) 

Eq. (50) follows from the integrals over k 2 in Eq. (48) being invariant under the transformation k 2 —>■ —k 2 . Eurthermore, if the 
fc-space shells are narrow compared to the scale over which the power spectrum varies, then the shell-averaged power spectrum 
can be pulled out of the integrals. In Appendix D we detail the computation of Eq. (50) and show that the covariance can be 
reexpressed as: 


Cov[P(ki),P(k,)] 


2(2 


Vi 


^Lp\ki)5^, 




P{ki) 



(51) 


with the functions defined by Eq. (36). 


5 OPTIMAL ESTIMATOR 

Our aim is to find the optimal weighting scheme that will maximize the S/Af ratio on a given band-power estimate of the 
galaxy power spectrum. 


5.1 The optimal weight equation 


To begin, note that maximizing the S/Af ratio is equivalent to minimizing its inverse, the noise-to-signal ratio Af/S. The 
square of the latter can be expressed as: 


P[«;(r,L,M)] 


Cp {ki) 

P\ki) 


2{2Tvf 

Vi 




(j_+Q) 

P{ki) 



(52) 


In the above expression, we have written the squared noise-to-signal P[iu] as a functional of the weights w{r,L,M). The 
standard way for finding the optimal weights is to perform the variation of the functional F with respect to the weights 
w{r,L,M). Operationally, the functional variation of F[w] is carried out by comparing E[m;] with the functional obtained for 
weight functions that possess a small path variation w(r, L, M) —>■ w{v, L, M) -\- Sw{r, L, M). This variation can be defined: 


JEH = F[w{r, L, M) -f 5w{r, L, M)] - E[M;(r, L, M)] = J dPr dL 


dM 


5F 


5w{r, L, M) 


Sw{r, L, M) 


(53) 


Extremisation means that the functional derivative is stationary for small variations around the optimal weights: 


5ui(r, L, M) 


(54) 


Recall that the definition of the weights in Eq. (29) includes the normalization constant A specified by Eq. (42). Since the 
normalization constant A is itself a function of the weights, it follows that E[w] is in fact a ratio of two weight-dependent 
functionals: 


F[w] 


V/uo] 


( 55 ) 
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with the definitions: 


A/’[w] 

Vlw] 



[5(l!i)(r)] + c [e[?!o)(r) + 5(2lo)(r)] 



In the above, we introduced the scaled effective window functions: 



(56) 

(57) 

(58) 


as well as the constant c = {1 + a)/P{ki), which helps keep the equations as compact as possible. We also dropped the overall 
constant 2{2tv)^ /Vi from the functional Wfui], since it plays no role in the minimization process. 

Minimizing Fjm] is equivalent to solving the functional problem: 

= 0 (5A/'[iu] - F[w]5V[w] = 0. (59) 

Therefore, to find the optimal weights satisfying Eq. (59), we first need to compute the variations of JV and P with a 
perturbation Sw. This calculation is outlined in Appendix E. Putting together Eqns. (59), (E6), (E7), we arrive at the 
following general equation for the optimal weights: 

{ [g[Z) (r)]' + c (r) + ZZo) (r)] } {^(yi) (r)6(M) + c [n;(r, L, M) + Wi (r, M)fi{M)NA'> (M)] } = (r) h{M) . (60) 

In the above, Wi was introduced by Eq. (E4), and the function l3{M) specifies the relation between the first and second 
factorial moments of galaxies in a halo of mass M, as discussed in Cooray & Sheth (2002): 

= P{M) . (61) 

Note that for a Poisson distribution, /3 = 1, although we do not make this assumption here. 

On inspection of Eq. (60) we notice that, with the exception of the weights w{r,L,M), none of the terms carries any 
explicit dependence on the luminosity of the galaxies. We therefore conclude that the optimal weights are independent of 
luminosity. Hence, without any loss of generality, we may now redefine the weights to be: 


w{v, L, M) => w{v, M) . (62) 

One immediate consequence of this is that the functions W(i)(r, M) can now be written in the much simplified form: 

dL<l>(L|M)e(r|L) = w'(r,M)5(r,M) . 

In the above we have introduced the function: 

/’oo noo 

5(r,M) = e(n)/ dLQ{x\L)^{L\M)=Q{Vl) I dL^{L\M) , (63) 

-'0 J Liaiaix) 

with the second equality following from Eq. (6). 5(r, M) is the number of galaxies in a halo of mass M observable at comoving 
distance r relative to the total number of galaxies in that halo. The range of S is the interval [0,1], and it has the following 
limiting behaviour: for M ^ Mmin we have lim;^._>o 5(r, M) = 1 and lim^_>oo 5(r, M) = 0; and for M < Mmin we have 
5(r, M) — 0, where Mmin is the minimum halo mass required for a dark matter halo to be able to host a galaxy. Note that 
for a volume-limited survey 5 = constant. 

A further consequence of Eq. (62) is that the effective survey window functions given by Eq. (58) reduce to: 
e["L)(r) = J dMniM)b^iM)N^"\M) ^(r, M)5(r, M)]" . (64) 

Implementing these considerations in Eq. (60), we arrive at the equation governing the optimal weights: 

I [e(i!i)(r)]' + c [g[?!o)(r) + e(2!o)(r)] | |e(qi)(r) + [l + m)NZiM)S{v, M)] | = g[;|i)(r) . (65) 




5.2 The optimal weights 


We now seek a general solution for the weight equation Eq. (65). To begin, we notice that the only part of the weight equation 
that carries any mass dependence is the second bracket on the left-hand side of Eq. (65). If we set the radial vector to a 
constant r = ro, then the optimal weights at fixed position inside the angular mask must have the mass dependence: 
b{M) 


w{ro, M) oc 


1 +P{M)Ni^\M)S{vo,M) 


( 66 ) 


The weights are therefore proportional to the bias of the dark matter halo in which the galaxy is hosted and inversely 
proportional to the factor [1 -|- I5{M)N^\M)S{vo, M)]. Since this last term depends on the galaxy selection function S, the 


MNRAS 000, 1-?? (0000) 








Optimal estimation of the galaxy power spectrum 11 


weight function is not separable in position and mass, as was found by SM14 for the case of optimal weighting of a sample 
of galaxy clusters. Nevertheless, without any loss of generality, we can factor out this part of the weight function from the 
general weight solution: 


■w{r, M) 


b{M) 

1 +P{M)Ni^\M)S(r,M) 


(67) 


where w(y) is a function of position only that needs to be determined. It is clear from the above equation that the term on 
the right-hand side encompasses the whole mass dependence of the optimal weights. If we now reinsert this expression into 
Eq. (65) we see that the weight equation reduces to: 




(r) -I- 5(2,0) 



(r) -I- cw 



( \ 


( 68 ) 


In order to proceed further we need to recompute the effective survey window functions Q functions from Eq. (64) with the 
new weight function Eq. (67). It is straightforward to show that: 


5(1,!)(»■) 
5[?!o)(r) 
5(2!o)(*’) 


= ?2;(r)neff(r) , where rieff(r) = / dMn{M)o{M) 


/■ 


Nf^fM)S{r,M) 


= w^{r) j dMn{M)b^{M)P{M) 

(r) [ dMn{M)b^{M) — 


1 + P{M)Nf^\M)S{r,M) 


~2 

= W 


1 +P{M)N^^\M)S{r,M) 
N^^\m)S{v,M) 


■P{M)Ni^\M)S{T,M)Y 


From the above equations we also notice the useful relation: 
5(po)(r) + 5(2^0) (r) = w(r)5(h)(r) = w^(r)hefi(r) . 


(69) 

(70) 

(71) 


Replacing all these ingredients back into Eq. (68), after a little algebra we find that w has the solution: 


w{r) = l/[c-|-neff(r)] . 


(72) 


On putting together Eqs. (67) and (72), back substituting the constant c = (1 -I- a)/Pi, we arrive at the general solution 
for the optimal weights: 

b{M) 1 


^(r, M) = 


1 -b P{M)Nf^'’{M)Sir, M)] [(1 + «) + n^s{r)Pi] 


(73) 


This expression is the central result of this paper. Before inspecting how these weights behave for a specific case, a number 
of interesting points may be noted. First, if we were able to identify galaxies in a survey whose host halo masses were drawn 
from some narrow range, then the first factor in Eq. (73) would be constant and so the weights would revert back to a scheme 
that structurally resembles FKP, although with a different effective number density. Second, we note that there is no natural 
limit where the above weighting scheme follows that derived by PVP. This dissimilarity emphasises how important an effect 
modifications of the underlying model assumptions can be on the matter power spectrum estimation and optimisation. 

Figure 1 demonstrates how the optimal weights vary as a function of the galaxies host halo mass and redshift. At low 
redshifts, the galaxy selection 5 —>■ 1. For galaxies that are hosted by low-mass haloes, < 1 and so w x b{M). On 

the other hand, for the high mass clusters ^ 1, and w(r, M) oc b{M)/N^\ m). Hence we would expect the galaxies 

in low-mass haloes to be weighted more strongly than those in high-mass haloes, since the bias is rather a slowly evolving 
function of halo mass. At higher redshift, we would expect that this trend reverses, since {5,heff} —7 0 and so the weights 
effectively follow the bias of the host haloes. These trends are exactly what is seen in the figure. For reference, Fig. 1 also 
compares the optimal weights with the original FKP weight function, given by: iCFKp(r) oc [1 -|- n(r)P{k)]~^ The upturn at 
large masses in the left panel, is driven by the mass dependence of the ratio b{M)/N{M). For large masses, b{M) is a steep 
function of mass b{M) oc ® (Seljak & Warren 2004), whereas for most halo occupation distribution models, N{M) oc 
(Zehavi et al. 2011a). Hence, leading to an upturn for large masses. 


5.3 Time evolution of the optimal weights 

Before moving on, we briefly discuss the redshift dependence of the optimal weights in Eq. (73). So far, we have considered that 
h(M), 6(M), <E>(L|M), fV5)(M), P{M) and ^(r) are all independent of time (here we will parameterise time evolution through 

^ Note that in order to evaluate the weight functions we took miim = 22 and adopted the CLF model of Yang et al. (2003) to compute 
(S(r,M) and For the function I3(M) we employed the model presented in Cooray Sz Sheth (2002) derived from semi-analytic 

galaxies. 
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Figure 1. Left panel: evolution of the optimal weights in a fiducial flux-limited survey as a function of halo mass. The thick blue and 
thin red lines represent the optimal weights and the FKP weights, respectively. The solid, dashed and dot-dashed line styles denote the 
results for increasing x? respectively. Right panel: evolution of the optimal weights as a function of redshift for several halo masses. 
Thick blue and thin red lines denote optimal and FKP weights. The solid, dashed and dot-dashed lines show the results for galaxy-, 
group- and cluster-scale halo masses, respectively. We have taken the flux-limit to be miim = 22. 


the comoving distance x)- This is approximately correct if the survey volume is sufficiently small so that these functions do not 
evolve appreciably over the survey. In general, however, they are time dependent. Therefore we would have: fi{M) —^ n(M, x), 
h(M) ^ h{M,x), <E>(i|M) ^ ^ M^\M,x), P{M) ^ p{M,x) and ^(ri - rz) ^ ^(ri - V 2 ,XuX 2 ) = 

r»(xi)r»(X2)C(ri - r2). 

In the last equality we have assumed that the correlation function obeys linear theory, hence the resulting growth factors. 
Working under this assumption, we redehne the Q functions to absorb the growth factors: 

= j dMn{M,x)[D{x)KM,x)rNi^\M,x)[w\v,M)S{v,M)Y • (74) 


Formally this is equivalent to redehning the halo bias parameter: b{M) —> D{x)b{M,x), and we prefer this latter approach. 
Thus, Eq. (73) becomes: 

D{x)b(M,x) 1 


w{r,M) = Y' .... TT--r ’ 

[l + P{M, x) (M, x)5(r, M)J [(1 + a) + Pi meix)] 

where the new effective number density is: 

V«(M,x)^(r,M) 


neff(r) = / dMn{M,x)D^{x)b'^{M,x) 


l + p{M,x)N^"HM,x)S(r,M) 


(75) 


(76) 


6 INFORMATION CONTENT OF GALAXY CLUSTERING 


The ability of a set of band-power estimates of the galaxy power spectrum to constrain the cosmological model can, theoret¬ 
ically, be determined through construction of the Fisher information matrix. Under the assumption that the density held is 
Gaussianly distributed, one hnds that the power spectrum for a given Fourier mode is exponentially distributed about the 
mean power, and that the band-power estimate is x^ distributed (Takahashi et al. 2011). Owing to the central limit theorem, 
in the limit of a large number of Fourier modes per fc-space shell, the power spectrum estimates thus approach the Gaussian 
distribution. Under the assumption that the power spectrum estimator is Gaussianly distributed, it can be shown that the 
Fisher matrix has the form (Tegmark et al. 1997; Tegmark 1997) (but see Abramo (2012)): 




iTr[C-'C,„G-iC,,3] 


^dPj 

^ da 813 


^ da 8/3 


where the approximate equality follows from the fact that the second term on the right-hand side of the first equality 
dominates over the hrst term, since it scales directly in proportion with the number of Fourier modes, whereas the first term 
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is independent of the number of modes. In the above we have made use of the notation djda = djdOa to denote partial 
derivatives with respect to the cosmological parameters 6a. On taking the covariance matrix to be diagonal, as is the case in 
Eq. (51), the above expression for the Fisher matrix becomes: 




a0 


dlogPi— — d log Pj d log Pi d logP* 


dp 


da 


dp 


JV 


(fe* 


(77) 


If we now define the effective survey volume through the expression, 
n3 / \ 2 


VMki) = 


Vi 


I ) 


(78) 


and take the continuum limit for the Fourier modes, we find that the Fisher matrix can be expressed as (Tegmark 1997): 


^•y.B — 


/ 


d^k d\ogP{k)d\ogP{k) 


V.fi{k) . 


(79) 


{2iiY da dp 

Thus in order to determine the information content of the galaxy power-spectrum obtained using a general weight function w, 
we simply need to calculate Ves[w\{k) or equivalently S/JV[w]{k). It is clear from Eqs. (51) and (52) that a general expression 
for the S/JV is given by: 


(fe*) = 




y dV [g(i|i)(r)] |y d\ Qg(J|i)(r)] -f [g(i!o)(r) + g( 2 !o)(r)] 

In the case of the optimal weights from Eq. (75), a little algebra leads to the simplified result: 

2 „ r -f; _ , X n2 


2(2^; 


2(2^ 


/ 


r 


Pt neff(r) 


.(1 -I- a) -I- Pj nefiF(r). 


(80) 


(81) 


The above expression will be useful for forecasting how well a future galaxy redshift survey may constrain cosmological 
parameters, after an optimal power spectrum analysis has been performed. 


7 PRACTICAL CHALLENGES IN IMPLEMENTING THE OPTIMAL WEIGHTS 

In order to implement the optimal weighting scheme, one requires knowledge of: the halo mass function h(M); the halo bias 
function b{M); the conditional probability density <E>(L|M); the first and second factorial moments of the halo occupation 
distribution as parameterised by N^fM) and P{M)\ and a way to associate each galaxy in the survey to a host halo. A 
possible route for achieving this is as follows: 

• Pure halo-dependent quantities: h(M) and b{M). These functions can be determined directly from numerical simulations; 
there also exist a number of accurate semi-analytic fitting functions in the literature (for recent examples see Tinker et al. 2008; 
Crocce et al. 2010; Watson et al. 2013). However, in order to employ these one needs to specify the underlying cosmological 
model - we do not consider this too troublesome, since it is also required to turn redshifts into distances. 

• Galaxy formation dependent functions: ^{L\M), N^\m). These require a model of galaxy formation or 

additional measurements. On adopting a state-of-the-art SAM, these functions can be measured directly (Benson et al. 2000; 
Cooray & Sheth 2002). They may also be obtained from the data through the CLF approach (Yang et al. 2003; van den Bosch 
et al. 2013). 

• Associating galaxies to groups: this step could be performed through application of standard friends-of-friends group 
finding algorithms or more sophisticated colour-magnitude-redshift grouping methods (Eke et al. 2004; Koester et al. 2007; 
Rykoff et al. 2014). 

• Determine group halo mass: through the use of good quality mock catalogues, such as can be facilitated through a SAM, 
one may apply the same grouping algorithms as were used on the real data to the mock data, thus finding the mapping 
between each group and the most likely halo mass (Eke et al. 2004). 

• Implement the optimal weighting scheme and measure P{k). 

Owing to the fact that the steps enumerated above can not be performed without error, it is likely that this will introduce 
additional scatter that we have not accounted for in our optimal estimator. We expect that this scatter will not bias the 
measurements, but will most likely lead to a reduction in signal-to-noise. We shall leave it as a task for future work to explore 
how well this method can be implemented in detail. 


8 CONCLUSIONS 

In this paper we have developed the theory for the unbiased and optimal estimation of the matter power spectrum from the 
galaxy power spectrum. Our approach generalises the original approach of FKP, by taking into account central ideas from 
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the theory of galaxy formation: galaxies form and reside exclusively in dark matter haloes; a given dark matter halo may host 
many galaxies of various luminosities; galaxies inherit part of their large-scale bias from their host halo. 

In §2 we described the generic properties of a galaxy redshift survey and presented a new theoretical quantity: the galaxy- 
halo double delta expansion. We demonstrated how one may use this expansion of the halo and galaxy helds to answer basic 
statistical questions concerning the galaxy distribution. In particular we gave a derivation of the galaxy luminosity function 
in this framework. 

In §3 we presented estimators for the galaxy correlation function and power spectrum. It was proved that, in the large-scale 
and large-survey-volume limits, these were unbiased estimates of the dark matter correlation function and power spectrum. 
We demonstrated that, similar to FKP, in our scheme the matter power spectrum could be obtained by subtracting an 
effective shot-noise component followed by the deconvolution of the power spectrum associated with an effective survey 
window function. 

In §4 we derived general expressions for the covariance matrix of the weighted galaxy power spectrum, including all non- 
Gaussian terms arising from the nonlinear evolution of matter fluctuations, discreteness effects and finite survey geometry 
effects. These results generalise the earlier results of (Meiksin & White 1999; Scoccimarro et al. 1999; Smith 2009). In the 
limits of large-scales, large survey volumes, and Gaussian fluctuations, the covariance matrix was found to be diagonal. 

In §5 we found an equation that governs the optimal weights to be applied to galaxies. We found a general solution of the 
weight equation. Interestingly, the solution did not carry any explicit dependence on galaxy luminosity. Instead the weights 
were found to be simply a function of two variables: the spatial position within the survey and the mass of the dark matter 
halo hosting the galaxies. 

In §6 we presented a new expression for the Fisher information matrix, for a weighted galaxy power spectrum measurement. 
We also presented a formula for the signal-to-noise obtained for the optimal weights. 

Finally, in §7 we outlined the practical steps that would need to be followed if one were to carry out the optimal power 
spectrum analysis. 

In a companion work (Smith & Marian 2015), we explore the signal-to-noise and cosmological information gains achievable 
through the optimal weighting scheme. In a future work, we will also explore how well one may implement such a scheme 
with real data. 
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APPENDIX A: DERIVATION OF THE TWO POINT CORRELATIONS: {N^N'f), {N^Ng), AND {NsNg} 

Let us begin by defining the short hand notation for the correlation: = (ng(r, L, x, M)ng(r', L', x', M')). Following the 

analysis of §2, this correlation may be written: 



s J = 1 



(Al) 
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If we now split the sums over i and j into two parts, a piece where i ^ j and a piece where i = j, then we find 

/ JVh 

(rigrig) = ( — Mj) 

\ i¥=3 

/N^{Mi) iVg(Mj) 

(5°(r - Tfc - Xj)(5°(L - Lfc)(5°(r'- r; - Xj)5°(L'- Li 

fc=l !=1 

' JVh 

+ ( ^ 5°(x - x,)5°(M - M,)5°(x' - X,)5°(M' - Mi) 

/iVg(MO 

h°(r — rfc — Xi)5°(L — Lk)S°{r' — r; — Xi)5°{L' — Li) 


(A2) 


Consider the terms associated with the i ^ j sum, since we have assumed that the galaxy properties hosted by the ith halo 
are independent of the galaxy properties in the jth halo, we may write the average of these terms as the product of the two 
averages. Next, consider the term i = j, and notice that we may also separate the sum over k and I into two terms, a term 
with k ^ I and a term with k = 1. This leads us to write the following expression: 

/ Nk 

(n^n^ = / 5°(x — Xi)5°(M — Mi)5'^(x^ — Xj)(5°(M^ — Mj) 

\ 


Ng(Mi) 

<5°(r-rfe-x,)<5°(L-Lfe 

k=l 


5°(r'-ri-x,)5°(L'-Li: 

1 = 1 


i Nf, 

+ (Y1 - M.)S°{x' - x05°(M' - Mi) 

\ i=j 

Alg(Mi) A^g(MO 

y~^ y~^ (5°(r - rj, - Xi)5°(L - Lfc)5°(r'- Ti - Xj)(5°(L'- Li) 

fc = l l^k 

fNg(Mi) 

+ ( ^ 5°(r - rfc - Xi)5°(L - Lj,)5°(r'- Tfe - Xi)(5°(L'- Lis) 

\ fe=i 

We are now able to compute the expectations over the galaxy populations, and with the help of Eq. (13) we find: 

/ Nk 

(wgrig) = / (5°(x — Xi)h°(M — Mi)(5°(x'— Xj)( 5°(M'— Mj) 

\ 

X ArW(Mi)Ar^i)(Mj)f7(r-x,|MOf7(r'-Xj|Mj)cE>(L|MO-I>(L'|Mj)e(r|L)0(r'|L' 


(A3) 


Nh 


+ (Y. - Mi)6°{3c' - - Mi) 

\ i=j 

X [ TVf )(Mi)f7(r - x,|M,)t/(r' - x,|Mi)$(L|Mi)-l-(L'|Mi)0(r|L)e(r'|L') 
+fV^')(M,)<E>(L|MOf/(r - x,|MO0(r|L)5°(L - L')S°{r - r')] ) , 


(A4) 


where in the above we have used a short-hand notation for the factorial moments of the galaxy numbers: 

OO 

A«(M) = (Ag(iVg - 1)... (iV^ - ; + l)|M) = ^ P(Ag|A(M))iVg(iVg - 1)... (iV^ - i + 1) . (A5) 

JVg=0 

Let us now deal with the averages over the dark matter haloes and let us write the first and second terms in Eq. (A4) as 
(rigUg)^ and (ngng)^. Considering the first term, the expectations may be computed as in Eq. (18), and we find 

Nh „ iVfc 

(rigTig)^ = 51 ]/ ]Q ... jXatj^, Ml,..., MAr^)5°(x - Xi)5°(M - Mi)5°(x'- Xj)(5°(M'- Mj) 

X N^^\Mi)N^^\M,)U{r - 3i,\M^)U{r' - :>^,\M,)${L\M^)^{L'\Mj)e{r\L)e(r'\L') 

= Afh(iVh - l)p(x,x',M,M')iVf>(M)A‘j(^>(M')f/(r-x|M)f/(r' -x'|M')-l>(L|M)$(L'|M')0(r|L)0(r'|L'fA6) 
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The joint probability density functions for the halo centres and masses may be expressed in terms of products of their 1-point 
PDFs and correlation functions. For the case of two-points we have: 

P(xi,Xi,Mi,M2) = p(xi,Mi)p(x2,M2) [1 +^''(xi,X2,Mi,M2)] = -^^^(xi,X2,Mi,M2)] . (A7) 

iVh 

In addition, if we assume that the cluster density field is some local function of the underlying dark matter density (Fry & 
Gaztanaga 1993; Mo & White 1996; Mo et al. 1997; Smith et al. 2007), the cross-correlation function of clusters of masses Mi 


and M 2 , at leading order, can be written: 

f(|xi-X 2 |,Mi,M 2 ) = 6 (Mi) 6 (M 2 )^(|xi-X 2 I) , (A 8 ) 

where ^(r) is the correlation of the underlying matter fluctuations. On using this relation in Eq. (A 6 ) we find, 

(ngrig)^ = h(M)h(M') [1 -f 6 (Mi)fe(M 2 )C(|xi - X 2 I)] Nf'\M)Nf'\M')Uir - x|M)[/(r' - x'|M') 

x$(L|M)<I>(L'|M')e(r|L)e(r'|L') . (A9) 

Returning now to the second terms in Eq. (A4) and following a similar derivation to the first term, we find 

= h(M)iVf’(M)$(L|M)$(L'|M)e(r|L)e(r'|L')f/(r - x|M)f/(r' - x|M)<5°(x - x)S°{M - M') 

+n{M)N^^\M)^{L\M)Q{v\L)U{v - x|M)5°(x - x')5°(M - M')5°{L - L')5^{v - r') . (AlO) 

Following the derivation we may now straightforwardly write down the results for the cases of the cross- and 

auto-correlation of the synthetic galaxy-halo field with the real one: 

(ngn() = a"^h(M)h(M')A'g(M)iVg(M')e(r|L)e(r'|L')[/(r-x|M)17(r'-x'|M')4>(L|M)<E>(L'|M') ; (All) 

{risn,) = a"^h(M)h(M')A'g(M)Aig(M')e(r|L)e(r'|L')C/(r - x|M)I/(r' - x'|M')tE>(L|M)<F(L'|M') 

+a~^n{M)Nf\M)^{L\M)^{L'\M)Q(r\L)Q{v'\L')U{v - x|M)f/(r' - x|M)(5°(x - x')(5°(M - M') 
+Nf^\M)^{L\M)e{r\L)U{r - x|M)<5°(x - yi)5°{M - M')S°{L - L')5°{r - r') , (A12) 


where in the above we have made use of the following short-hand notation: (ngn() = (ng(r, L, x, M)ns(r', L', x', M')) and 
(risn!,) = (ns(r,L,x,M)ns(r',L',x',M')). 

APPENDIX B: THE GALAXY COVARIANCE MATRIX 
B1 Comment on the covariance matrix of the matter power spectrum 

Starting with Eq. (46) and inserting Eq. (43) we find: 

Cov[p(ki),P(k 2 )] =Cov[|Pg(ki)|",|.Pg(k 2 )|"] -Cov[|je'g(ki)|",Rhot] -Cov[|Pg(k 2 )|",Pshot] + Var[P,hot]. (Bl) 
where: 

Cov[|.Fg(kl)|M.Pg(k 2 )|"] = (|Pg(kl)nPg(k 2 )|")-(|.Pg(kl)|^)(|Pg(k 2 )|") ; 

Cov[|.Pg(k,)|",Pshat] = <J|.Pg(k,)|2p,hot)-(|.Fg(k,)|"){Rhot) ;i€{l,2}; 

Var[Pahot] = (Pfiot) - (fahot)^ . 

If we assume that the statistical uncertainties are dominated by |.Pg(ki)p and not Pshot, then we may approximate the 
covariance matrix as is written in Eq. (46). 


APPENDIX C: DERIVATION OF THE COVARIANCE MATRIX OF THE Pg POWER SPECTRUM 

To begin, we notice that we may rewrite the covariance matrix of |Pg(fc)|^, which is given in Eq. (46), as 

Cov[|Pg(kl)|",|Pg(k 2 )|"] = f d"A:3d"fc4<5°(kl-fk3)5°(k2-bk4) [ {Pg(ki) . . . Pg(k4)) - (Pg(kl)Pg(k3)) {Pg(k2)Pg(k4))] . 

(Cl) 

We see that in order to proceed we need the 4-point function of the Pg(k) modes. On transforming to real space, this 
requirement is transformed into the need to determine the four-point correlation: 

{Pg(ki) . . . Pg(k4)) - (Pg(ki)Pg(k3)) {Pg(k2)Pg(k4)) 

= I d\i...dV4 [ (Pg(ri)...Pg(r4))-(Pg(ri)Pg(r3))(Pg(r2)Pg(r4))]e*i-'i+-+‘^"-^" (C2) 
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Cl Computing the 4-point correlation function of J^g(r) 

Since the terms (J^g(ri)J^g(rj)) are given by Eq. (27), we are left with the task of computing the 4-point correlation function 
of the field J^g(r). Using our relation Eq. (8), this is given by: 


(7-g(ri)...J-g(r4)) = 


dLi(f XidMiw{ri,Li,x.i,M^) 

X [ng(ri,Li,Xi,Mi) - ans(ri, LiXi, Mi)]... [ng(r4, L4, X4, M4) - ans(r4, L4, X4, M4)]^ 

I J dLid^XidMiw{ri, Li,Xi, Mi)| | (ng,i ... ng, 4 ) - a ^ (rig,mg, 2 ng, 3 ns, 4 ) + Scycj 


+a^ 1^ (rig,irig, 2 ns, 3723 , 4 ) -I- 5permj — (rig, 1723 , 2723 , 3213 , 4 ) + Scycj -I- (723,1 ... 723 , 4 ) j (C3) 

with the short-hand notation identical to that used in Appendix A: ng,i = n^{ri, Li,Xi, Mi) and 72s,i = 72s(ri, Li, x^, Mi). 
Focusing on the first term in curly brackets on the right-hand-side, and if we insert our galaxy-halo double delta expansion 
we find: 

(rig,! .. .ng,4) = / 5°(x'i - XiJ5°(M( - MiJ .. .(5°(xl - XiJ5°(Mi - Mi^) (g)\ , 

\7l,»2,»3,74 = l / li 

where we have introduced the short-hand notation 

Y, - rii)5°(Li - LiJ .. .^°(r4 - - L^J 

\jl7i2 7J3«J4 = l 

As was done for the case of the two-point function, we may now split the sum over haloes into five types of terms: 

(n;,i...r2;,4) = (Fi) + (r2) + (Fs) + (r4) + (Fs) , 

where the terms Fi are defined: 

Ti = X]<5°(x'i-x,,)...<5°(xl-XiJ5°(M(-Mi,)...<5°(Mi-MiJ(g) ; 

il 7^*2 7^43 

4 

F 2 = y] <5°(xi-Xii)5°(x2-Xi2)5°(M(-Mii)(5°(M2-Mi2) P[ |5°(Xp-Xip)5°(Mp-Mip)| (g) + 5 perms ; 


(C4) 


(C5) 


(C6) 




p=3 


z 4 

Fa = y] tl |5°(xp - Xip)5°(Mp - Mij,)| P[ |5°(xg - XiJJ°(M'- Mij| (g) -I- 2perms ; 


il=i27^^3=^4 P=1 


q = 3 


F 4 = 


F 5 = 


4 

Y -XiJ5°(M( - MiJ P[ |5°(xp - Xij,)(5°(Mp - Mi^)^ (g) + Sperms ; 

il7^i2=^3=M P = 2 

4 


(C7) 


* 1 = 12 =® 3=*4 P =1 

Computing (Fi): Integrating over the Dirac delta functions and relabelling primed variables to unprimed, we write: 

Nh 

(ri) = E p(xi,...,X4,Mi,...,M4)(g) 

= Ah(A^h — l)(A^h — 2 )(Ah — S)p(xi, Ml).. .p(x4, M 4 ) |j 1 + {?i2 + C13 + Ci 4 + ^23 + ^24 + ^34} 

+ {C123 + C234 + C34I + C412} + {^12^34 + ^13^4 + ^14^23} + ’7l234j (g) 

« hi ... fi4 [ 1 -b {Cl 2 + 5 perms} + {Clas + S perms} + {^12^34 + 2 perms} + r?} 234 ] (g) , 

where in the above we have decomposed the joint 4-point PDF into its respective 1-point moments and set of correlation 
functions. In this case (} and r; denote the connected three- and four-point correlation functions, respectively. Note also that 
we used the following short-hand notation: 

= r(xi,Xj,Mi,Mi) ; crjfe = C(xi,Xi,Xfc,Mi,Mi,Mfe) ; 77 ^*,; = 77‘^(xi, Xj, x*,, x;. Mi, M^, Mfc, Mj). (C8) 

Computing (Fa): We denote <5°^ = 5°(Mi — Mj)S^{xi — x^) and S^ij = 5^{Li — Lj)5^{ri — rj). Taking the expectations 
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Oh,23 


and integrating over the delta functions we find: 

JVh JVh 

(r2) = ^p(xi,X2,X3,Mi,M2,M3)<5h,34 (s) + Ml, M 2 , M4)S°23 id) 

Nh iVh 

+ X!^(^l’^3,X4,Mi,M3,M4)<5h,12 (5) + X]p(^l’^2,X4,Ml,M2,M4)5h,i3 (ff) 

n=^27^*3/*4 n=i37^^27^^4 

+ ^p(xi,X 2 ,X 3 ,Mi,M 2 ,M 3 )< 5 h ,14 (5) + X]p(^l’^ 2 ,X 3 ,Ml,M 2 ,M 3 ) 5 h ,24 (fl) 

n=U7^*27^*3 ^2=U7^n7^^3 

= A^h(Afh - l)(fVh - 2 ) |^p(xi,X2,X3, Ml, M 2 , M 3 ) (g) 5 °,34 +p(xi,X 2 ,X 4 , Mi, M 2 , Mi) (g) 

+ p(xi,X 3 ,X 4 ,Mi, M 3 ,M 4 ) (g) ( 5 h ,12 +p(xi,X 2 ,X 4 ,Mi,M 2 ,M 4 ) (g) Sh,l 3 +p(xi,X2,X3,Mi,M 2,M3) (g) 5 h ,14 
+ p(xi,X2,X3,Ml,M2,M3)(sr)5h,24] 

~ ninsfll [1 + ^13 + ^14 + ^34 + C134I (5) 5h,12 + filfi2fi4 [1 + ^12 + ^24 + ?24 + C124] {<?) '^h,23 

+ nih2n3 [1 + ^12 + ^13 + ^23 + C123I (5) 5h,14 + fiin2n4 [1 + ^12 + 5 i 4 + ^24 + C124] (<?) '^h,23 

+ nih2n3 [1 + ^12 + ?23 + ^31 + C123] (5) ^h,24 + nin 2 n 3 [1 + C12 + C23 + ?31 + C123] {9) '^h,34 • 

Computing {r 3 ): Again, on taking the expectations and integrating over the delta functions we find: 

(rs) = ^p(xi,X3,Mi,M3) {g) 5 ^, 125 ^, 3 A + '^P{^ 1 ,^ 2 ,Mi,M 2 ) {g)S° 136 ° 24 + ^P(xi,X 2 ,Mi,M 2 ) (s) 5h,14'5h,23 

il='i2 7^^3=*4 n=i3/^2=U n=i4 7^^2=*3 

= A‘h(iVh - 1) |^p(xi,X 3 ,Mi,M 3 ) (g) ( 5 h,i 2 5 h ,34 + p(xi, X2, Mi, M2) (g) 5 h,i 3 ( 5 h ,24 + p(xi, X2, Mi, M2) (g) ( 5 £',i 4 5 h. 23 ] 

= hlfl 3 [1 + ^^ 3 ] (g) S^^12 St,34 + nin2 [1 + ^f 2 ] (g) <5h,13 St,24 + 5h,14 5i?,23j ■ 

Computing (Fi): Again, on taking the expectations and integrating over the delta functions we find: 

(Fi) = ^ p(xi, X4, Ml, Ml) (g) 5 ?,12 5 ?.i 3 + ^ p(xi, X3, Mi, M3) (5) 5 ?,12 5 ?,14 

21=72=^37^14 21=22=247^23 

+ ^ p(xi, X2, Ml, M 2 ) (g) St,13 5h,i4 + p(^i’ ^2, Ml, M 2 ) (g) 5 °23 St ,24 

*1=^3=M#«2 *2=^3=^4?^n 

= N{N - 1) [ p(xi, X 4 , Ml, Ml) (g) St,i 2 < 5 °i 3 + p(xi, X 3 , Mi, M 3 ) (g) 5 h.i 2 5 h,i 4 
+ p(xi, X 2 , Ml, M 2 ) {g) 5 h,i 3 5 h,i 4 + P(xi, X 2 , Ml, M 2 ) {g) St ,23 St ,21 ] 

= hlh4 [1 + ^14] (g) St ,12 < 5 i ?,13 + hlh3 [1 + C13] (s) ^h,12 5h,14 + nin 2 [1 + ^^2] (p) ['^h*,13 <5h,14 + St,23 <5?,24! 

Computing {F 5 ): Again, on taking the expectations and integrating over the delta functions we find: 

(Fs) = P(xi, Ml) { 5 ) (5h_12 5h,13 ^h,14 = ni ( 9 } St,12 St,13 St,14 

11—12—*3—^4 

Collecting the terms (Fi), (F 2 ), (F 3 ), (F 4 ) and {F 5 ), we write: 

(ng,ing, 2 ng, 3 ng, 4 } = hih2h3h4 |[1 + ^12 + Ci3 + ?i4 + C 23 + C 24 + ?34 + C 123 + C 124 + C 134 + C 234 + ^ 12^34 + C 13 C 24 

+ ?14^23 + ^ 1234 ] ( 3 )+ [1 + C 13 + Cl4 + ?34 + Ci34]M- ( 5 )+ [1 + Cl 2 + ?24 + ^41 + Ci 24]M— (g) 

412 n3 

+ [1 + Cl 2 + C 13 + ^23 + C 123 ] — ( 9 } + [1 + C 12 + ^14 + C 24 + C 124 ] (g) 

ni 713 

+ [1 + ^12 + C 23 + ^31 + C 123 ] — { 9 ) + [1 + C 12 + C 23 + C 3 I + C 123 ] — ( 5 ) 

724 72-4 

cD rD cD rD cD rD cD rD 

, ri , cc i‘^h,12<3h,34 / \ , ri , cc i‘^h,13‘^h,24 / \ , ri , cc i^h,14'^h,23 / \ , n , cc i^h,12‘^h,13 / \ 

+ [1 + ^ 13 ] - - {gn [1 + ^ 12 ] - - ( 5 )+ [1 + ^ 12 ] - - w+ [1 + gi4] - - { 9 ) 

712714 7lz7l4 71^714 7l27lz 

cD cD eD cD cD cD cD cD cD '] 

, i<^h,12'^h,14 / \ , ri 1 cc i<^h,13<^h,14 / \ , fi i cc i'^h,23<^h,24 , v , <^h,12<^h,13^h,14 , vl 

+ 1 + 63 - - (5) + 1 + 62 - - (5) + 1 + C12 - - (ff) H— - _ {g)) ■ 

712714 773714 773774 772773774 I 

Based on the above expression we are now in a position to immediately write down the other 4-point function cases required 
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to compute Eq. (C3): 


(^g,l^g, 2 U.g,3723^4) — 


a 711712713714 ^ (1 + ^12 + Cl 3 + ^23 + C 123 ) (s) + (1 + C 13 ) “T— {q} 


+ (1 + ^ 12 ) 


rD cD 

^h,23 / V , ^h,13 / \ 

(ff) + (fl) 


na 


nz 


rD rD 

n2nz ' 


(^g,l^g,2'^s,3^s,4) 

(rig,ms,2ns,3^3,4) = a“'^nin2n3n4 |(5) + a 

(ris.i^s,2ns,37^5,4) = a“^nin 2 n 3 n 4 < (p) + a 


cD rD cD cD 

_2- - - - J/1 , \ / \ , ^h,12 / \ , /I , \ <^h,34 / \ , <^h,12 <^h,34 / v 

a 71i7l27l37l4 (1 + ^ 12 ) (g) + (g) + « (1 + ^ 12 ) (g) + Q _ _ (g) 


712 

rD cD rD 

Oh,23 Oh,24 Oh,34 


7^3 
Oh,12 


n4 


n4 


Oh,23 


724 722724 

rD rD 

/ V , Oh 230h,24 / V 

( 9 ) +a ’ _ ’ (g) 

723724 

fD rD 

Oh,14 


Oh,24 


(C9) 

(CIO) 

(Cll) 


^ (g) + ^ (g) + ^ ( 3 ) + ^ (ff) + ^ (9) + (g) 

722 723 723 724 724 724 


I 

+ Qi 


rD rD 
0h,120h,34 


(g) + 

rD rD 
OVi Q.'^Oi 


rU rU 

0h,130h,24 


rD rD rD rD 

/ V , Oh 230h,14 / V , 0h,i20h,13 / v , 

( 9 ) + ^ ^ { 9 ) + ^ (g) + 


723724 


722723 


722 724 


^(g) 


rD rD 

, Oh,l30h,l4 / V , ^h,23^h,24 , \ 

+ - - { 9 ) + - - { 9 ) 

723724 723724 


rD rD rD 

, 3 0h,120h,130h,14 / \ i 

+ Ok - - - {9) ) • 

722723724 ' 


(C12) 


In order to compute the 4-point correlation function of J^g, we need to compute the sum of 4 terms of Eq. (Cll), with 
permuted location of g and s. This is given by: 


— (72g,i72s,272s,372s,4) H“ (72s,l72g^272s,372s,4) T (72s,l72s,272g,3?2s,4) H~ (72s,l72s,272s,372g,4) 

^D ^D ;:D xU 


— 72i722?23724 i 4(3; (g) 2 a 


rD 

Oh,23 


rJJ 

/ V , Oh,24 / V , ^h,34 / V , 

^ ( 5 ) + — (g) + — (g) + ^ 

723 724 724 723 


^h,13 


(s) + %^ (») + ^ (9) 

724 722 


+ Qi 


rD rD rD rD rD rD rD rD 

Oh,23 Oh,24 / \ , Oh 13 Oh 14 . , Oh 12 Oh 14 , . Oh 12 Oh ,13 , . 

( 9 ) + (g) + (g) + (g) 


723 724 


723 724 


722 724 


722 723 


We also need to compute the sum of 6 terms of Eq. (CIO), with permuted location of g and s. This is given by: 
1/2 = (72g,i72g,272s,372s,4) + 5 permS 

ap.. Tap., ap./ 

(g) 


(5^ 

nin2n3n4a ^ (6 + ^12 + C 13 + ?i4 + C 23 + C 24 + ^ 34 ) (g) H—E— (g) + 


rD rD 

Oh,13 Oh,23 

fiz 


723 


724 


724 


+ 


jtD 

^h,; 


724 


(g) + a 


(1 + C12) 


^ + (1+Cw)^+ (1+6^3)^ 

724 724 724 


<g) 


+ 


rD rD 

/i I \ Oh,23 / \ I /I I /-c \ Oh 13 

1 + ^ 14 ) g + 1 + 64 

723 723 


Oh,12 Oh,34 


(g)+(i + e34)^(g)+2- ^ 

722 722 724 


(g)+ 2 


rD rD r 

Oh,13 Oh,24 Oi 

ns n4 


'h,23 Oh,14 
723 724 


(g) 


In addition we need to compute the sum of 4 terms of Eq. (C9), again where the locations of g and s are permuted. This sum 
can be written: 


Z/3 = (?igu?ig^2/ig,3ns,4) + (iigung^2?i3,3ng^4) + (ng,ins_2iig,3ng_4) + (risking,2?ig^3ng, 4 ) 

= a ^hih 2 ri 3 h 4 |(4 + 2^12 + 2^13 + 2^23 + 2^14 + 2^24 + 2^34 + C 123 + C 124 + C 134 + C 234 ) (g) 


JfD 


JfD 


.fD 


+ (1 + Cu) ^ (g) + (1+c^2) (g) + (1 + eh) (g) + (1 + eh) ^ (g) + (i+eh) ^ (g) 

712 723 ns 712 724 


+ (1 + eh) ^ (g) + (1+eh) ^ (g) + (i + eh) ^ (g) + (i + eh) ^ (g) + (i+eh) ^ (g) 

n4 ns n4 n4 ns 


Oh,24 


Oh,34 


+ (i + eh)-^—(g) + (i+eh) 

n4 n4 


( 5 ) + 


/ V , Oh, 12 Oh,14 . V , 

\9) H=-=— \9) + 

712 723 712 724 ns n4 


Oh,12 Oh,13 


Oh,13 Oh,14 


/ V , Oh,23 Oh,24 / V i 

^ 9 ) + —^ -i— { 9 ) ^ • 

ns n4 


Collecting the terms Li, L 2 and L 3 along with (ng 4 ng, 2 ng, 3 ng, 4 ) and {ns,ins, 2 ns^ 3 ns, 4 ) and inserting them into Eq. (C3), and 
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Optimal estimation of the galaxy power spectrum 21 


after some algebra we arrive at the following arrangement: 

(Fg,i ... Fg,4) = ^ I ?7^234 (fl) + 


(l + a).D 

S12 n-=-<^h,12 

n2 


C 34 + 


(1 + a) 


+ 


^13 + 


(1 + a) 


ns 


"h,13 


(1 + a), 


^24 + —=-^^h,24 

n4 


5v, 24 {g) + 

rD cD cD rD rD 

, .c ‘^h,12 / \ , >c ^h,23 / \ , AC ^h,14 / \ , >c <^h,13 / \ , ac ^h,24 / \ , ac ^h,34 , \ 

+Cl34^ \g) + Cl24^ \g) + Cl23^ \g) + Cl24^ \g) + Cl23^ (5) + Cl23^ (<?/ 

712 T^Z 77.4 7lz 714 7 I 4 


fc (1 + a) D 

S14 "I-=-<Jh,14 

724 


724 
?23 + 


"h,34 


( 5 ) 


(1 + a) 


ns 

xO 

Oh 


+?13 


CJJ CJJ 
c '^h,12<^h,34 


722724 


{ 9 ) + ^12 


cD cD 
c <^h,13 ^h,24 


cD rL 
^h,14^h,23 


rU rJJ 
c ^h,12^h,13 


(g) + ei 2 (g) + ei 4 — 

723724 723724 722723 


(g) + ^13 


cJJ rJJ 
c ^h,12^h,14 


722724 


(g) 


(g) 


I ^h,13'5h,14 / \ I tc '^h,23'5h,24 / ^ , (1 + Q!^)^h,12 ^h,13'^h,14 , \ 

+?i2 - - {g) + Ci2 - - {g) + - - - (g) 

nsn4 713714 712713714 


where in the above we used the short-hand notation = J^g(ri). In order to compute the covariance we also require the 
second term in Eq. (C2). On repeatedly using Eq. (27) we find that this can be written: 


(Eg,iEg,3) (Eg,2Eg,4) = |y^ dLi(f XidMimWj 


Ois (g) + 


(1 -I- a)( 5 j (’43 


713 


(g) 


\ , (1 + Oi)^h,24, I \ 

64 0 ) +-^ (5) 


714 


Joining the last two equations, we write the covariance as: 


(^g(ri)^g(r2)^g(r3)J'g(r4)) - (^g(r4)^g(r3)) 11 {/ dLid^XidMi TliTUij |77 i234 {g) + 


-b 


?^2 + ^i^4°i2 


712 


^34 + 


(1 + a), 


714 


“h,34 


( 5 ) + 




724 


C 23 + 


(l-ba) 


ns 


Oh,2S 


(5) + Ci34%^ (g) 

ns 


rD rD cD rD cD cD cD cD cD 

, .c ^h,23 / \ , AC ^h,14 / \ . .c ‘^h,13 / \ , ac ^h,24 / \ , ac <^h,34 / \ , ac <Jh,12<Jh,34 / \ , ac <Jh,13^h,24 , v 

+C124 Kg) + C123 \g) + Ci24 \g) + C123 \g) + C123 \g) + Os — \g) + ^12 ^— (5) 

723 724 723 724 724 722724 723724 


cD cD rD rD rD rD /i _i_^3\rD rD rD 

, -ii,x^-n,zcs/ \ , AC -n,i^'^n,ics/ v , ac ‘^h,12^h,14/ v , ^h,13‘^h,14/ v , ^h,23^h,24/ v , U i20h,i30h,i4 , v 

+^12 ^— \9n 64 ——w)+ ^13 ’ ^ (g)+ ?i 2 ’ ^ (g)+ ^12 ^ (g)+- ^ ^ ^ ( 5 ) 


rD rJ- 

ac ^h,14^h,23/ 


rjj rL 

ac ^h,12^h,13/ 


723724 


722723 


722 724 


723724 


723724 


722723724 


(C13) 


We now make the assumption that the fluctuations are close to Gaussian, hence we take g = g = 0. 


(F,(ri)F,(r2)F,(rs)F,(r4)}-(F,(ri)F,(rs)) (F,(r2)F,(r4)) ^ [ dUd\,dM, 

i = l 


cc I (1 + a) D 

S12 H-^1 


712 


'es4+^-^s? 


724 




rD rD rD rD 

<^h, 12 <^h ,14 / \ , ^h.l 2 ^h ,34 / v 

^— {g) H— - - {g) 

712714 712714 


h,34 


+ Cl2 


{g) + 


(1 + a) D 

S14 H-=-^h .l 


724 


^23 + 


i TliW. 

(1 + a) D 

-^-"h, 

ns 


(1 + a )5h,l2^h,13^h,14 


712713714 

rD rD 
c <^h,12^h,13 


{g) 


{g) + 04^ 


712713 


(5) 


rD rD rD rD rD rD rD rD 

Oh,13^h,14 / \ , ^h,13^h,24 / v , ^h,23^h,14 / v , <3h,23^h,24 / v 

- - {g) + - - {g) H— - - {g) + _ _ {g) 

713 714 713 714 713 714 713 714 


(C14) 


On taking the limit that ^ 1, the first and last three terms will be sub-dominant (Smith 2009). Using the linear bias 
model from Eq. (A8), we write 


(J'g(ri).Fg(r2)J-g(r3)J-g(r4)) - (J-g(ri) J'g(r3)) (7^g(r2) J-g(r4)) = ^t[\ ( dUd\,dM, 

i=l 


, , , I (1 + a) rD 

0l02?12 H-r-Oh.12 


712 


1 , h I (1 + a) nD 

03O4C34 H-z-Oh,34 

714 


{g) + 


, 1 t- I (1 + a) -D 

3i 04?14 H-z-Ojj 14 

714 


i niWi 


U U C I (1 + a) rD 

0203?23 H-z-Oh,23 

ns 


{g)\- (CIS) 


C2 Averaging over the galaxy distributions 

Let us now return to the evaluation of the expectation values for the galaxy population. Consider again Eq. (CIS) and let us 
look in particular at the terms Kg) and any premultiplying Dirac delta functions. For compactness, we shall use the following 
definition: 

f{p\q) = e(rp|Lp)$(Lp|M,)t/(rp - x,|M,) (C16) 
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22 Smith & Marian 


We find that there are three types of terms forming the individual (g) factors: 


( 5 ) ^ n {K^pf(P\P)} ’ 


p = l 

3 


(5) 


<5h*.12<5h,34 (g) 


P^l 




/(4|1) + <5^14 


pe{i,3} 


K 


( 2 ) 




V(2|1) + 5°i2 


g.3 




(iy/(4|3) + 5g,34 

g,3 


rD fD 
^h,12<Jh,34 • 


(C17) 


The rest of the (g) factors can be worked out in a similar way. Embedding them into Eq. (C15), and using Eqs. (Cl) and (C2), 
we arrive at the expression for the covariance of the power spectrum estimator in the Gaussian approximation: 


Cov[|J-g(kl)|Mj-g(k2)|^] = 


r j3 

J ( 27 r )3 ‘l)G'(i.i)(k 2 , —q) + (1 + a) [G( 2 ,o)(ki + k 2 ,0) + G(ki, k 2 )] 


+ 


f (27r)3 (1^1 5 ci)G'(i,i)(—k 2 , —q) + (1 + a) [G( 2 ,o)(ki — k 2 , 0) + G(ki, —k 2 )] 


(CIS) 


where we have defined two more functions: 


Gp,^)(k,q) = j dMn{M)b’^{M)N^^\M)Wll-,{k,q,M) ■, 

G(ki,k 2 ) = I dMn{M)M^\M) I ^vy('i)(ki,q,M)W('()(k 2 ,-q,M) . 


(C19) 


C3 The covariance matrix in the large-scale limit 

In the large-scale limit, the profiles of the galaxies behave like Dirac delta functions, e.g. U{r — x|M) — >■ 5°(r — x). It is 
straightforward to show that in this limit the above-defined functions become: 

Gp,„)(k,q)^g«^)(k-fq) ; G(ki, ka) ^ 5^% (ki + k2), 

where are the Fourier transforms of the functions dehned by CITE later. With these changes, Eq. (CIS) can be expressed 

as: 

Cov[| J•g(kl)|^ I J-g(k 2 )n = I ^P(q)g« 4 )(ki + q)g« )(k 2 - q) + (1 + a) [4\>o)(ki + ^ 2 ) + GIZ)^^! + k 2 )] 

/ + ‘l)e(iq)(-k 2 - q) + (1 + a) [4^.^o)(ki - k 2 ) + e[i%)(ki - k 2 )] 

which is exactly Eq. (47). This concludes our proof of it. 


APPENDIX D: SHELL AVERAGING THE COVARIANCE MATRIX OF THE Pg POWER SPECTRUM 


A reasonable approximation when computing the shell-averaged power, is that if the shells are narrow compared to the scale 
over which the power spectrum varies, one can factor the latter out of the integrals in Eq. (50), writing: 


Cov[|Pg(A:0lMPg(A:2)l' 


" = 2P"(fcO/^ ^Q|J;i|i.i)(ki+k2)Qg;ij4,4)(-ki-k2) 


IV-i ^3 
43 7 


+4(l + a)P(fcO ^S[;;i|i,i)(ki+k2)Q[?|o)(-ki-k2) 

+4(l + a)P(fe)^ ^^^(k4+k2)Qg|o)(-ki-k2) 


Q^^|o)(ki+k2)Q[^|o)(-ki-k2) 


+2(1+ ^ ly ^2|?|o)(ki+k2)Q'?|o)(-ki-k2) 

+ 2 ( 1 + «)^/^ ^ ly ^2(2|o)(ki+k2)Qg|o)(-ki-k2) . 


(Dl) 
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Our task now is to solve the integrals forming the terms of Eq. (Dl). In general, this integrals have the form: 


Jv. Vi iv 


d ^2 Q(ni , 71 - 2 ) 
Vi ^3 




^^ik2'(ri—r2) 


^ r d^kl f d^k2 f jS .3 ^(ni,n2) / \ i(kl+k2) {ri-r2) 

= [ d^r-id^roOV^’"^'> f ^ »i<:i.(ri-r2) f 

= j - r2|)i^(fc,|ri - ral) 

= J d^r2iMkir2i)Mkjr2i)T.VffJ^f^^^^^^^^^ 


In the above, we have defined the shell-average of the spherical Bessel fnnction as 


joikir) = — 


I 


ki+Ak/2 


ki-Ak/2 


dkikiA'n:jo{kir) . 


(D2) 


(D3) 


To obtain the last line of Eq. (D2), we made a change of variables r 2 i = r 2 — ri, and defined the correlation function of the 
weighted survey window function to be. 


y(ni,n2)(n[,n2) , 

(q,/2 1^1 .^2 ^ 



d\iQ 


(ni,n2) 
(Zi,Z2|mi,m2) 


(ri)Q 


(n'i,n^) 

(Z^ ,^2 \'>^i 


(r 2 i -f ri) . 


(D4) 


In the limit that the survey volume is large, the weighted survey window correlation fnnction is very slowly varying over 
nearly all length scales of interest, and so can be approximated by its value at zero-lag. Using the orthogonality relation of 
the Bessel functions, drOja{ur)jc,{vr) = {-k/ 2v3)5^(u — v) we write: 



d'fc2.( 


Vi 


n{rii,n2) 


^,^,)(kr + k2)Q|"^;7=I, 


" 2 ) 


(-ki - k2 


OOf_y(-n-l,n-2){n'i,rL2) 

{Zi ,Z2 |mi ,m2)(Z^ ,Z2 |m'j^ ,7712) ' ' 


(D5) 


We shall now apply this result to the six terms of Eq. (Dl) and write for each of them: 


y7(l.l)(l,l) /Q'l _ 

f [Q|i;iji,i)(r)] = 

' f d'^'^[G(l\)iO] ; 


(D6) 

^(1.1)(2) /Q\ _ 

/ '^'^2(M|i.i)W2(qo) 

(r) = j d\ [e^(i\\)(r)] 

r^(M)W; 

(D7) 


J d ?’Q(i’i|i^i)(r)Q(2|o) 

(r) = J d^r [^g[J_\j(r)] 

^(2,o)(»’) ; 

(D8) 

w(2)(l) /qn _ 

/ C^®^2(1|0)W2(2|0)(»’) 

= J d^rglfofr)glf>^^^ 

,(r); 

(D9) 

y7(2)(2) /qn _ 

Jd^r [Q|2)^,(r)]' = | 

d^'r [e(qo)('’)] ; 


(DIO) 

v(l)(l) ('Q'l — 

jd^r [Q[2|0)(r)]' = / 

d^r [e^(2\o)(i’)] 


(Dll) 


Finally, pntting together all these terms we write our final expression for the shell-averaged covariance as: 


Cov[|.F,(fc7)|M.E,(fc,)|^] 


2^)3 _2^ 
Vi 


{ki)S^j J 


+ 


(i_+q) 

P(ki) 



which is in fact Eq. (51) from the main text. 


APPENDIX E: FUNCTIONAL DERIVATIVES 

In order to compute the functional derivatives of J\f and D making up E[w], we must first work out the functional derivatives 
of the functions Q and the normalisation A. 
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El Functional derivatives of the Q functions and normalisation A 

For small variations in the path of w we find that the functional derivative of (J can be written: 

[w + H = JdMn{M)b(M)M^'> (M) j dL$(L|M)e(r|L) [w;(r, L, M) + Sw{y, L, M)] = M + [w]; 

= JdMn{M)b{M)N^^\M)JdL^{L\M)Q{r\L)Sw{r,L,M) ■ (El) 

+ H = y’dMh(M)iVf)(M)|y'dL$(L|M)e(r|L)[M;(r,L,M) + 5tc(r,L,M)]| = ealo)M + 

5g|i|o)H = 2 JdMn{M)N^^\M)Wi{r,M) J dL^{L\M)e{r\L)5w{r,L,M) ■, (E2) 

+ H = JdMn{M)N^^\M)J dL$(L|M)e(r|L) [«;(r, L, M) + h«;(r, L, M)]" = gglo) H + H ; 

delalojM = 2 JdMn{M)N^^\M)JdL^{L\M)e{r\L)wir,L,M)5w{r,L,M). (E3) 

In the above we have neglected the terms containing [dm]™ with n ^ 2, and we have used a similar definition to Eq. (58) and 
defined: 

W, (r, M) = W, (r, M) . (E4) 

Again, for small variations in the value of w, the functional derivative of the normalization constant A can be written: 

A[w + Sw] = Jd^r (g[^^i^[w + Sw]^ = Jr (g[l\y[w] + Sg[l]-^^[w]^ =jd^rp^(i\~){r)^ + 2jd^rg[^^^gr)Sg\l]i-)[w] 


= A[w] + 5A[w] , 

MH = “2 j d^^’0(M)(r) j dMn{M)b{M)N^^\M)jdL^{L\M)Q{v\L)5w{r,L,M) . 


(E5) 


E2 Functional derivative of Af [w(r, L, M)] 

Consider Eq. (56), we may write the functional derivative as: 


dAflu)] = 


2 I (r)] + c (r) + (r)] |{2£;g^g^ (r)deg\\) H + c M + 5£^g,o) M]} ■ 


Using the functional derivatives of Eqs. (El), (E2), (E3) to calculate the terms in the parenthesis on the right-hand side, we 
obtain the functional derivative of the numerator A/”: 


<5^M = 4 1 d\dMdL I (^[sg\^(r)] V c [egg)(r) + egg)(r)]) n(M)fVg) (M)$(L|M)e(r|L) 

X [e;g\j (r)&(M) -b Wi (r, M)Arf) (M)/Arg> (M) -b u;(r, L, M)] j 5w(v, L, M). 


(E6) 


E3 Functional derivative of T> 

Since T) = A?, we have 5T>\w\ = 2A 
given by 


w(r,L,M)] 

w](5A[w]. Using the functional derivative in Eq. (E5), the functional derivative of T>[w\ is 


5V\w\ = 4AHydVdMdL|gg_\j(r)h(M)6(M)Arg)(M)<E>(L|M)e(r|L)|5w(r,L,M) . (E7) 
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